ABSTRACT

Gossip algorithms for aggregation have recently received sig- 
nificant attention for sensor network applications because of 
their simplicity and robustness in noisy and uncertain envi- 
ronments. However, gossip algorithms can waste significant 
energy by essentially passing around redundant information 
multiple times. For realistic sensor network model topolo- 
gies like grids and random geometric graphs, the inefficiency 
of gossip schemes is caused by slow mixing times of ran- 
dom walks on those graphs. We propose and analyze an 
alternative gossiping scheme that exploits geographic information.
By utilizing a simple resampling method, we can
demonstrate substantial gains over previously proposed gossip
protocols. In particular, for random geometric graphs,
our algorithm computes the true average to accuracy $1/n^{\alpha}$
using $O(n^{1.5}\sqrt{\log n})$ radio transmissions, which reduces the
energy consumption by a $\sqrt{n/\log n}$ factor over standard gossip
algorithms.

Categories and Subject Descriptors: F.2.2, G.3 
General Terms: algorithms 

Keywords: gossip algorithms, random geometric graphs, 
sensor networks, distributed consensus, distributed aggre- 
gation 

1. INTRODUCTION 

Consider a network of n sensors, in which each node col- 
lects a measurement in some modality of interest (e.g., tem- 
perature, light, humidity etc.). It is frequently of interest 
to solve the averaging problem: namely, to develop a dis- 
tributed and fault-tolerant algorithm by which all nodes can 
compute the average of all n sensor measurements. Gossip
algorithms solve the averaging problem by having each node
randomly pick one of its one-hop neighbors and exchange
current values with it. The pair of nodes computes the pairwise
average, which then becomes the new value for both
nodes. By iterating this pairwise averaging process, it is
easy to show that all the nodes converge to the global average
in a completely distributed manner. Although fairly
simple, the distributed averaging problem and related con- 
sensus problems can be viewed as building blocks for solving 
more complex problems [19,21], including computing gen- 
eral linear functions as well as optimization of non-linear 
functions in sensor networks. 
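The pairwise averaging step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ring topology built by `ring_neighbors` and the helper names are hypothetical and are not part of the paper, and the node that wakes up is drawn uniformly at random in place of a real clock model.

```python
import random

def ring_neighbors(n):
    """Hypothetical topology: node i can talk to i-1 and i+1 on a ring."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]

def pairwise_gossip(values, neighbors, rounds, rng):
    """Run `rounds` nearest-neighbor gossip updates and return the new values."""
    x = list(values)
    n = len(x)
    for _ in range(rounds):
        i = rng.randrange(n)            # node that wakes up
        j = rng.choice(neighbors[i])    # random one-hop neighbor
        avg = (x[i] + x[j]) / 2.0       # pairwise average
        x[i] = x[j] = avg               # both nodes adopt it
    return x

def spread(x):
    """Gap between the largest and smallest estimate."""
    return max(x) - min(x)
```

Each update leaves the sum of the estimates unchanged, which is why the only possible limit is the global average; the spread between the largest and smallest estimate can never grow.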

The key issue is how many iterations it takes for such a
gossip algorithm to converge to a sufficiently accurate estimate.
Variations of this problem have received significant
attention in recent work [4,5,11,12]. The convergence speed 
of a nearest-neighbor gossip algorithm, known as the aver- 
aging time, turns out to be closely linked to the mixing time 
of the Markov chain defined by a weighted random walk 
on the graph. Boyd et al. [4] showed how to optimize the 
neighbor selection probabilities for each node so as to find the
fastest-mixing Markov chain on the graph. For certain types 
of graphs, including complete graphs, expander graphs and 
peer-to-peer networks, such Markov chains are rapidly mix- 
ing, so that gossip algorithms converge very quickly. 

Unfortunately, for the graphs corresponding to typical 
wireless sensor networks, even an optimized gossip algorithm 
can result in very high energy consumption. For example, 
a common model for a wireless sensor network is a random
geometric graph [17], in which all nodes communicate
with neighbors within a radius r. With the transmission radius
scaling in the standard way as $r(n) = \Theta(\sqrt{\log n / n})$, even
an optimized gossip algorithm requires $\Theta(n^2)$ transmissions
(see Section 2.3), which is of the same order as the energy
required for every node to flood its value to all other nodes.
This problem is noted in [4]: "In a wireless sensor network,
Theorem 6 suggests that for a small radius of transmission,
even the fastest averaging algorithm converges slowly", and
it seems to be fundamental for gossip algorithms on these
graphs. Intuitively, the nodes in a standard gossip protocol
are essentially "blind", and they repeatedly compute pairwise
averages with their one-hop neighbors. Information
diffuses only slowly throughout the network, roughly moving
distance $\sqrt{k}$ in k iterations (as a random walk).

Accordingly, the goal of this paper is to develop and ana- 
lyze alternative — and ultimately more efficient — methods 
for solving distributed averaging problems in wireless networks.
We leverage the fact that sensor nodes typically
know their locations, and can therefore use this knowledge 
to perform geographic routing. Localization is a well stud- 
ied problem (e.g., [13,20]), since geographic knowledge is 
required in numerous applications. With this perspective in
mind, we propose an algorithm that, like a standard gossiping
protocol, is completely randomized, distributed and
robust, but requires substantially less communication by
exploiting geographic information. The idea is that instead of
exchanging information with one-hop neighbors, geographic
routing can be used to gossip with random nodes that are
far away in the network. We show that the extra cost of
multi-hop routing is compensated by the rapid diffusion of
information.

Figure 1: Illustration of a random geometric graph.
The solid lines represent graph connectivity, and
the dotted lines show the Voronoi regions associated
with each node.

The remainder of this paper is organized as follows. In
Section 2 we provide a precise statement of the distributed
averaging problem, describe our algorithm, and state our
main results on its performance. Section 3 contains proofs
of these technical results. In Section 4 we experimentally
evaluate the performance of our algorithm.

2. PROPOSED ALGORITHM AND MAIN 
RESULTS 

2.1 Problem statement 

2.1.1 Graph model 

Following previous work [4,8], we model our wireless sen- 
sor network as a random geometric graph [17] . In this model, 
denoted G(n, r), the n sensor locations are chosen uniformly 
and independently in the unit square, and each pair of nodes 
is connected if their Euclidean distance is smaller than some 
transmission radius r. (As discussed in Section 5, our results
have natural analogs for lattices and other graph structures
that are reasonable models of wireless networks.) It is well
known [7,8,17] that in order to have good connectivity and
minimize interference, the transmission radius r(n) has to
scale like $\Theta(\sqrt{\log n / n})$. For our analysis, we assume that
communication within this transmission radius always succeeds.
Note however that the proposed algorithm is very robust to 
communication and node failures. 
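As a rough illustration of the graph model, the following sketch samples a graph from G(n, r) by brute force. The function names and the constant `c` inside `standard_radius` are assumptions made for this example; the text only fixes the scaling of r(n), not the constant.

```python
import math
import random

def standard_radius(n, c=1.0):
    """r(n) = sqrt(c * log(n) / n); the constant c is an assumption."""
    return math.sqrt(c * math.log(n) / n)

def random_geometric_graph(n, r, rng):
    """Sample G(n, r): n uniform points in the unit square,
    with an edge whenever two points are closer than r."""
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    nbrs = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(pts[i], pts[j]) < r:
                nbrs[i].append(j)
                nbrs[j].append(i)
    return pts, nbrs
```

The quadratic pair loop is fine for small experiments; a spatial grid would be the usual choice for large n.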

2.1.2 Time model 



We use the asynchronous time model [4], which is well-matched
to the distributed nature of sensor networks. More
precisely, it is assumed that each sensor node has a clock
which ticks independently as a rate $\lambda$ Poisson process.
Consequently, the inter-tick times are exponentially distributed,
and independent across nodes and across time. This set-up
is equivalent to a single clock ticking according to a rate $n\lambda$
Poisson process at times $Z_k$. On average, there are approximately
n clock ticks per unit of absolute time (an exact
analysis can be found in [4]), but we will always be measuring
time in number of ticks of this (virtual) global clock.
Time is discretized, and the interval $[Z_k, Z_{k+1})$ corresponds
to the kth timeslot. We can adjust time units relative to
the communication time so that only one packet exists in
the network at each time slot with high probability.
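The equivalence between n independent clocks and one global clock can be made concrete with a small simulation sketch. The function names and the `horizon` parameter below are hypothetical; the first function merges the per-node tick streams, and the second states the equivalent view in which each global tick is owned by a uniformly chosen node.

```python
import random

def tick_sequence(n, rate, horizon, rng):
    """Merge n independent rate-`rate` Poisson clocks.
    Returns the ticks before `horizon` as sorted (time, node) pairs."""
    ticks = []
    for node in range(n):
        t = rng.expovariate(rate)        # first inter-tick time
        while t < horizon:
            ticks.append((t, node))
            t += rng.expovariate(rate)   # exponential inter-tick times
    ticks.sort()
    return ticks

def global_clock_owner(n, rng):
    """Equivalent view: each tick of the rate n*rate global clock
    belongs to a node chosen uniformly at random."""
    return rng.randrange(n)
```

In a simulation that only counts global ticks, drawing `global_clock_owner` once per timeslot is enough, and the actual tick times never need to be generated.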

2.1.3 Distributed averaging 

At time slot $k = 0, 1, 2, \ldots$, each node $i = 1, \ldots, n$ has
an estimate $x_i(k)$ of the global average, and we use $x(k)$ to
denote the n-vector of these estimates. The ultimate goal
is to drive the estimate $x(k)$ to the vector of averages $x_{\mathrm{ave}}\vec{1}$, where
$x_{\mathrm{ave}} := \frac{1}{n}\sum_{i=1}^{n} x_i(0)$, using the minimal amount of
communication. For the algorithms of interest to us, the quantity
$x(k)$ for $k > 0$ is a random vector, since the algorithms
are randomized in their behavior. Accordingly, we measure
the convergence of $x(k)$ to $x_{\mathrm{ave}}\vec{1}$ in the following sense [4, 12]
(essentially convergence in probability):

Definition 1. Given $\epsilon > 0$, the $\epsilon$-averaging time is the
earliest time at which the vector $x(k)$ is $\epsilon$-close to the
normalized true average with probability greater than $1 - \epsilon$:

$$T_{\mathrm{ave}}(n, \epsilon) = \sup_{x(0)} \inf_{k = 0, 1, 2, \ldots} \left\{ k : P\left( \frac{\|x(k) - x_{\mathrm{ave}}\vec{1}\|_2}{\|x(0)\|_2} \ge \epsilon \right) \le \epsilon \right\}, \tag{1}$$

where $\|\cdot\|_2$ denotes the $\ell_2$ norm.

Let $R(k)$ represent the number of one-hop radio transmissions
required for a given node to communicate with some
other node at clock tick k. In a standard gossip protocol,
the quantity $R(k) = R$ is simply a constant, whereas for
our protocol, $R(k)$ will be a random variable (with identical
distribution for each node). The total communication cost
is measured by the random variable

$$\mathcal{C}(n, \epsilon) = \sum_{k=1}^{T_{\mathrm{ave}}(n, \epsilon)} R(k). \tag{2}$$

In this paper, we first analyze the expected communication
cost, denoted by $\mathcal{E}(n, \epsilon)$, which is given by

$$\mathcal{E}(n, \epsilon) = E[R(k)]\, T_{\mathrm{ave}}(n, \epsilon). \tag{3}$$

In addition, we provide an upper bound on the communication
cost, denoted by $\mathcal{D}(n, \epsilon)$, such that

$$P\{\mathcal{C}(n, \epsilon) > \mathcal{D}(n, \epsilon)\} \le \frac{\epsilon}{2}. \tag{4}$$

2.2 Proposed Algorithm 

The proposed algorithm combines gossip with geographic 
routing. The key assumption is that each node knows its 
geographic location. With that knowledge, every node can 
also learn the locations of its one-hop neighbors by having 
just one transmission per node. 



Suppose the j-th clock tick belongs to node s. Let $l(s)$
denote the location of node s. Node s activates and does
the following:

1. Node s chooses a point uniformly in the unit square.
Call this the target t. Node s forms the tuple
$m_s = (x_s(j), l(s), t)$.

2. Node s sends $m_s$ to its one-hop neighbor closest to t, if
any exists. If node r receives a packet $m_s$, it sends $m_s$
to its one-hop neighbor closest to t. Greedy geographic
routing terminates when a node receives the packet
and has no one-hop neighbor that is closer to
the random target than itself. Let v be the node
closest to t.

3. Node v makes an independent randomized decision to
accept $m_s$. If the packet is accepted, v computes its
new value $x_v(j+1) = (x_v(j) + x_s(j))/2$ and a message
$m_v = (x_v(j), l(v), l(s))$ is sent back to s via greedy
geographic routing. Node s computes
$x_s(j+1) = (x_v(j) + x_s(j))/2$, and the round ends.

4. If the packet is rejected, v chooses a new point t' uniformly
in the unit square and repeats steps 2-3 with message
$m'_s = (x_s(j), l(s), t')$.

We will refer to this procedure as a gossip round. Our
analysis of this randomized algorithm, given in Section 3,
consists of the following steps. First, we prove that when
$r(n) = \Theta(\sqrt{\log n / n})$, greedy routing always reaches the
node v closest to the random target in $O(\sqrt{n/\log n})$ radio
transmissions. Note that in practice more sophisticated geographic
routing algorithms (e.g., [10]) can be used to ensure that the 
packet approaches the random target when there are "holes" 
in the node density. However, greedy geographic routing is 
good enough for our model and other choices for routing 
algorithms will not affect our results. 
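The routing and averaging steps of a gossip round can be sketched as follows. This is a simplified illustration, not the full protocol: the function names are hypothetical, `pts` and `nbrs` are assumed to hold node coordinates and one-hop neighbor lists, and a rejected packet simply ends the round instead of being re-targeted as in step 4.

```python
import math

def greedy_route(pts, nbrs, src, target):
    """Forward greedily toward `target`; return the list of visited nodes."""
    path = [src]
    cur = src
    while True:
        # One-hop neighbor closest to the target (cur itself if isolated).
        best = min(nbrs[cur], key=lambda j: math.dist(pts[j], target),
                   default=cur)
        if math.dist(pts[best], target) >= math.dist(pts[cur], target):
            return path          # no neighbor is closer: routing stops here
        cur = best
        path.append(cur)

def gossip_round(x, pts, nbrs, s, target, accept=lambda v: True):
    """One simplified round: route from s toward `target`, then average
    with the reached node v if v accepts. Returns (v, one_way_hops)."""
    path = greedy_route(pts, nbrs, s, target)
    v = path[-1]
    if v != s and accept(v):
        x[s] = x[v] = (x[s] + x[v]) / 2.0
    return v, len(path) - 1
```

Because the distance to the target strictly decreases at every hop, the loop always terminates; on graphs with holes it may stop at a node that is not the globally closest one, which is the situation the more sophisticated routing schemes address.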

Our randomized procedure induces a probability distribution
over the chosen sensor v (i.e., the one closest to the
randomly chosen target). If this distribution were uniform, then
it would follow immediately that the averaging time $T_{\mathrm{ave}}(n, \epsilon)$ is
$O(n \log \epsilon^{-1})$. In actuality, the probability of choosing sensor
v is equal to $a_v$, the area of its associated Voronoi region.
The distribution of Voronoi regions is not very uniform, so in
order to bound the averaging time $T_{\mathrm{ave}}(n, \epsilon)$, we apply
rejection sampling in order to temper the distribution. In particular,
we apply the following rejection sampling scheme,
due to Bash et al. [2]. Let a be an n-vector of areas of the
sensors' Voronoi regions. We set a threshold $\tau$ on the cell
areas. Sensors with cell area smaller than $\tau$ always accept
a query, and sensors with cell areas larger than $\tau$ reject the
query with a certain probability. The rejection sampling
method protects against oversampling and limits the number
of undersampled sensors, and allows us to prove that
$T_{\mathrm{ave}}(n, \epsilon) = O(n \log \epsilon^{-1})$, even for this perturbed distribution.

Of course, the rejection sampling scheme requires some
random number Q of queries before a sensor accepts. In
terms of the number of queries, the total number of radio
transmissions for the kth gossip round is

$$R(k) = O\left(Q \sqrt{\frac{n}{\log n}}\right). \tag{5}$$

Therefore, if $T_{\mathrm{ave}}$ gossip rounds take place overall, the expected
number of radio transmissions will be

$$\mathcal{E}(n, \epsilon) = O\left(E[Q] \sqrt{\frac{n}{\log n}}\; T_{\mathrm{ave}}(n, \epsilon)\right). \tag{6}$$



Accordingly, a third key component of our analysis in Section 3
is to show that the probability of acceptance remains
larger than a constant, which allows us to upper bound the 
expectation of the geometric random variable Q. We also 
prove an upper bound on the maximum value of Q over Tave 
rounds that holds with probability greater than 1 — e/2. 

Putting these pieces of the analysis together, the main
result of this paper is that under the proposed geographic
gossip algorithm

$$T_{\mathrm{ave}}(n, \epsilon) = O(n \log(1/\epsilon)) \tag{7}$$

and therefore the total cost for computing the average with
geographic gossip is

$$\mathcal{E}(n, \epsilon) = O\left(\frac{n^{3/2}}{\sqrt{\log n}} \log \epsilon^{-1}\right). \tag{8}$$

Moreover, note that if we set $\epsilon = 1/n^{\alpha}$ in equation (8), then
we obtain $\mathcal{E}(n, 1/n^{\alpha}) = O(n^{3/2}\sqrt{\log n})$.
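The substitution behind this last statement can be written out explicitly. The following is a short worked sketch that treats $\alpha > 0$ as a fixed constant absorbed into the $O(\cdot)$ notation and uses $\log \epsilon^{-1} = \log n^{\alpha} = \alpha \log n$:

```latex
\begin{align*}
\mathcal{E}(n, n^{-\alpha})
  &= O\!\left(\frac{n^{3/2}}{\sqrt{\log n}}\,\log n^{\alpha}\right)
   = O\!\left(\frac{n^{3/2}}{\sqrt{\log n}}\,\alpha \log n\right) \\
  &= O\!\left(\alpha\, n^{3/2}\sqrt{\log n}\right)
   = O\!\left(n^{3/2}\sqrt{\log n}\right).
\end{align*}
```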

2.3 Related work and Comparisons 

In a series of papers [3,4], Boyd et al. have analyzed the
performance of standard gossip algorithms. Their fastest
standard gossip algorithm for the ensemble of random geometric
graphs $G(n, r)$ has an $\epsilon$-averaging time [4] (footnote 1) of
$T_{\mathrm{ave}}(n, \epsilon) = \Theta\left(n \frac{\log \epsilon^{-1}}{r(n)^2}\right)$.
For the r(n) in this paper, this averaging time
is $\Theta\left(\frac{n^2}{\log n} \log \epsilon^{-1}\right)$. For $\epsilon$ scaling like $n^{-\alpha}$ for any $\alpha > 0$,
this averaging time scales like $\Theta(n^2)$. Note that in standard
gossip, each gossip round corresponds to communication
with only a one-hop neighbor and hence costs only one
radio transmission, which means that the fastest standard
gossip algorithm will have a total cost of $\mathcal{E}(n) = \Theta(n^2)$ radio
transmissions for $\epsilon = \Theta(n^{-\alpha})$. Therefore, our proposed algorithm
saves a factor of $\sqrt{n/\log n}$ in communication energy
by exploiting geographic information.
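To get a feel for the size of this saving, the two scalings can be compared numerically. The sketch below drops all constants hidden in the order notation, so the returned numbers are only indicative; the function names are made up for this example.

```python
import math

def standard_gossip_cost(n):
    """Order-of-magnitude cost n^2 of standard gossip (constants dropped)."""
    return float(n) ** 2

def geographic_gossip_cost(n):
    """Order-of-magnitude cost n^(3/2) * sqrt(log n) of geographic gossip."""
    return n ** 1.5 * math.sqrt(math.log(n))

def saving_factor(n):
    """Ratio of the two costs, which equals sqrt(n / log n)."""
    return standard_gossip_cost(n) / geographic_gossip_cost(n)
```

For example, `saving_factor(10**4)` evaluates the ratio for a network of ten thousand nodes, and the ratio keeps growing with n.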

Two very recent papers by Moallemi and Van Roy [14] 
and Mosk-Aoyama and Shah [15] also consider the problem 
of computing averages in networks. The consensus propaga- 
tion algorithm of [14] is a modified form of belief propagation 
that attempts to mitigate the inefficiencies introduced by 
the "random walk" in gossip algorithms. However, their re- 
sults, although promising, have only been proven for regular 
graphs, and it is unclear whether their algorithm will prove 
efficient for the networks in this paper. In [15], the authors 
use an algorithm based on Flajolet and Martin [6] to com- 
pute averages and bound the averaging time in terms of a 
"spreading time" associated with the communication graph. 
However, they only show the optimality of their algorithm 
for a graph consisting of a single cycle, so it is currently 
difficult to speculate how it would perform on a geometric 
random graph. 

In [1] the authors consider the related problem of computing
the average of a network in a single node. They propose
a distributed algorithm to solve this problem and show how
it can be related to cover times of random walks on graphs.

Footnote 1: This quantity is computed in Section IV.A of [4], but the
result is expressed in terms of absolute time units, which
need to be multiplied by n to become clock ticks.


3. ANALYSIS 

3.1 Routing in $O(1/r(n))$

We first need some simple lemmas about the network con- 
nectivity and the feasibility of greedy geographic routing. 

Lemma 1 (Network connectivity). Let a graph be
drawn randomly from the geometric ensemble $G(n, r)$ defined
in Section 2.1.1, and let a partition be made of the unit area
into squares of side length $a(n) = \sqrt{2 \log n / n}$. Then the following
statements all hold with high probability:

(a) Each square contains at least one node.

(b) If $r(n) = \sqrt{10 \log n / n}$, then each node will be able to
communicate with a node in each of the four adjacent squares.

(c) All the nodes in each square are connected with each
other.

Proof. The proof of part (a) follows easily, since it requires
$O(n \log n)$ balls thrown randomly to cover n bins with
high probability (see [16] and [7] for more details). Moreover,
if we select $r(n) = \sqrt{5}\, a(n)$, then simple geometric
calculations show that each node will be able to communicate
with all other nodes in its square, as well as all nodes in
the four adjacent squares. ■
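Part (a) can be checked empirically with a small occupancy experiment. The sketch below is an illustration only: it rounds the number of cells per side down to an integer `k`, so the cells are slightly larger than $a(n)$, and the function name is hypothetical.

```python
import math
import random

def grid_occupancy(n, rng):
    """Throw n uniform points into a k x k grid whose cell side 1/k
    is at least a(n) = sqrt(2 log n / n); count the empty cells."""
    a = math.sqrt(2.0 * math.log(n) / n)
    k = max(1, int(1.0 / a))             # cells per side
    counts = [[0] * k for _ in range(k)]
    for _ in range(n):
        col = min(int(rng.random() * k), k - 1)
        row = min(int(rng.random() * k), k - 1)
        counts[row][col] += 1
    empty = sum(1 for row in counts for c in row if c == 0)
    return k, counts, empty
```

With n = 2000 the grid has 11 cells per side and each cell expects more than 16 points, so an empty cell is extremely unlikely, in line with the lemma.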



Lemma 2 (Greedy geographic routing). Suppose
that a target location is chosen in the unit square. Then
greedy geographic routing will route to the node closest to the
target in $O(1/r(n)) = O(\sqrt{n/\log n})$ steps.

Proof. By Lemma 1(a), every square of side length
$a(n) = \sqrt{2 \log n / n}$ is occupied by at least one node. Therefore,
we can perform greedy geographic routing by first matching
the row and then the column of the square which contains
the target, which requires at most $O(1/a(n)) = O(\sqrt{n/\log n})$
hops. After reaching the square where the target is contained,
Lemma 1(c) guarantees that the subgraph contained
in the square is completely connected. Therefore, one more
hop suffices to reach the node closest to the target. ■

These routing results allow us to bound the cost in hops 
for an arbitrary pair of nodes in the network to exchange 
values. In the next section, we describe a rejection sampling 
method used to reduce the nonuniformity of the distribution 
(induced by sampling locations rather than sensors). 

3.2 Rejection sampling 

As mentioned in the previous section, sampling geographic
locations uniformly induces a nonuniform sampling distribution
on the sensors in which a sensor v is queried with
probability proportional to the area $a_v$ of its Voronoi cell.
However, by judiciously rejecting queries, the sensors with
larger Voronoi areas can ensure that they are not oversampled.
We adopt the following sampling scheme [2]: given
some threshold $\tau > 0$, sensor v accepts the request with
probability

$$r_v = \min\left(\frac{\tau}{a_v},\, 1\right). \tag{9}$$

Figure 2: Rejection sampling in pictures. The total
shaded area is the probability of a query being rejected.
The new sampling distribution is given by
the white histogram, appropriately renormalized.



We can then calculate the probability that sensor v is
sampled:

$$q_v = \frac{\min(\tau, a_v)}{\sum_{t=1}^{n} \min(\tau, a_t)} = \frac{\min(\tau, a_v)}{|\{t : a_t > \tau\}| \cdot \tau + \sum_{t : a_t < \tau} a_t}. \tag{10}$$

Of more importance to us is the denominator of $q_v$, which
is the total chance that a query is accepted:

$$P_a = |\{v : a_v > \tau\}|\, \tau + \sum_{v : a_v < \tau} a_v. \tag{11}$$

Let Q denote the total number of requests made by a sensor
before one is accepted.
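The quantities $q_v$ and $P_a$ are easy to compute from a list of Voronoi areas. The sketch below assumes the acceptance rule $r_v = \min(\tau / a_v, 1)$, which is the rule consistent with the expression for $q_v$ above; the function names and the example areas are hypothetical, and all areas are assumed to be positive.

```python
def acceptance_probabilities(areas, tau):
    """r_v = min(tau / a_v, 1): small cells always accept,
    large cells reject some of their queries (assumed rule)."""
    return [min(tau / a, 1.0) for a in areas]

def sampling_distribution(areas, tau):
    """Return (q, P_a): the tempered sampling distribution and
    the overall probability that a query is accepted."""
    clipped = [min(tau, a) for a in areas]
    p_accept = sum(clipped)              # equals sum_v a_v * r_v
    q = [c / p_accept for c in clipped]
    return q, p_accept
```

For instance, with areas (0.5, 0.3, 0.1, 0.1) and threshold 0.2, the two large cells are clipped to the threshold and every cell with area at least the threshold ends up with the same sampling probability.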

A graphical picture of rejection sampling on the graph of
Voronoi cells is shown in Figure 2. Rejection sampling
"slices" the histogram at $\tau$, and renormalizes the distribution
accordingly. The total area that is sliced off is equal to
$1 - P_a$, the probability that a query is rejected. Thus we
can see that if $\tau$ is chosen to be too small, the probability
of rejection will become very large. In Lemma 3 we show
that choosing $\tau = \Theta(n^{-1})$ will keep the rejection probability
suitably bounded away from 1, so that the expected number
of queries $E[Q]$ will be finite. In particular, we choose $\tau$ such
that

$$P(a_v < \tau) = \min(\ldots). \tag{12}$$



The constants $\nu$ and $\mu$ control the undersampling and
oversampling respectively. With this choice of $\tau$, the results of
Bash et al. [2] ensure that no sensor is sampled with probability
greater than $(1 + \mu)/n$ and no more than $\nu n$ sensors
are sampled with probability less than $1/n$. The following
result establishes that the acceptance probability remains
sufficiently large:

Lemma 3. For $\tau = c n^{-1}$, we have $P(a_v > \tau) \ge 1 - 4c$.

Proof. We use a simple geometric argument to lower
bound $P(a_v > \tau)$. Consider a node s such that a circle of
area $\tau$ centered at it lies entirely within its Voronoi region, as shown
in Figure 3. Clearly, such nodes are a subset of those with
area larger than $\tau$. Let the radius of this circle be r. This r
is at most half the distance to the closest node. Thus in
order to inscribe a circle of radius r in the Voronoi region,
all other nodes must lie outside a circle of radius 2r around
the node. This larger circle has area $4\tau$, so

$$P(a_v > \tau) \ge (1 - 4\tau)^{n-1} = (1 - 4cn^{-1})^{n-1} \ge 1 - 4c. \tag{13}$$

Thus, by appropriate choice of c, we can make the acceptance
probability arbitrarily close to 1. ■

Figure 3: Inscribing circles in Voronoi cells.
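The last inequality in the proof is an instance of Bernoulli's inequality, and it can be sanity-checked numerically. The helper below is a hypothetical illustration that evaluates the middle expression of the chain for given n and c.

```python
def acceptance_lower_bound(n, c):
    """(1 - 4c/n)^(n-1): probability that none of the other n-1 nodes
    falls inside the disc of area 4*tau, with tau = c/n."""
    tau = c / n
    return (1.0 - 4.0 * tau) ** (n - 1)
```

For small c the value stays close to 1 for every n, which matches the statement that the acceptance probability can be pushed toward 1 by shrinking c.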



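Our next step is to bound the distance between the new
sampling distribution q and the uniform distribution $n^{-1}\vec{1}$.
This will be used in the next section to bound the second
eigenvalue of a matrix associated with the gossip algorithm.

Lemma 4. For any $\epsilon > 0$, there exist constants $\mu > 0$
and $\nu > 0$ such that rejection sampling with parameters
$(\mu, \nu)$ leads to

$$\sum_{v=1}^{n} \left| q_v - \frac{1}{n} \right| < \epsilon \tag{14a}$$

$$\left\| q - \frac{1}{n}\vec{1} \right\|_2 < \frac{1}{\sqrt{n}}\, \epsilon. \tag{14b}$$

Proof. Given $\epsilon > 0$, choose $\nu$ and $\mu$ such that $\nu + \mu < \epsilon$
and $\nu + \mu^2 < \epsilon^2$. We then expand the error function and
use the properties given by the sampling scheme:

$$\sum_{v=1}^{n} \left| q_v - \frac{1}{n} \right| = \sum_{v : a_v < \tau} \left| q_v - \frac{1}{n} \right| + \sum_{v : a_v \ge \tau} \left| q_v - \frac{1}{n} \right|.$$

Now we use the properties of rejection sampling. On the set
$\{v : a_v < \tau\}$ we have $1/n > q_v$, so we can upper bound the
error by $1/n$. Furthermore, we know $|\{v : a_v < \tau\}| \le \nu n$. On
the set $\{v : a_v \ge \tau\}$ we know $q_v$ is constant and
$1/n \le q_v \le (1 + \mu)/n$ by construction. Thus

$$\sum_{v=1}^{n} \left| q_v - \frac{1}{n} \right| \le \nu n \cdot \frac{1}{n} + n \left( \frac{1 + \mu}{n} - \frac{1}{n} \right) \le \nu + \mu,$$

which is less than $\epsilon$ by our choice of $\nu$ and $\mu$. The same
case split applied to the squared errors gives
$\sum_{v} (q_v - 1/n)^2 \le \nu n \cdot n^{-2} + n (\mu/n)^2 = (\nu + \mu^2)/n < \epsilon^2/n$,
which is (14b). ■

Finally, we need to bound the expected number of rejections
and the maximum number of rejections in order
to bound the expected number of transmissions and total
transmission time. Recall that Q is the number of queries
that a sensor has to make before one is accepted, and has
distribution

$$P(Q = t) = P_a (1 - P_a)^{t-1}. \tag{15}$$

Lemma 5. For a fixed $(\mu, \nu)$, rejection sampling leads to
a constant number of expected rejections.

Proof. The random variable Q is just a geometric random
variable with parameter $P_a$, so we can write its mean
as:

$$E[Q] = \sum_{j=1}^{\infty} j\, P_a (1 - P_a)^{j-1} = \frac{1}{P_a} = \frac{1}{|\{v : a_v > \tau\}|\, \tau + \sum_{v : a_v < \tau} a_v} \le \frac{1}{(1 - \nu)\tau n} = O(1),$$

since $\tau = \Theta(n^{-1})$ by construction. ■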

Lemma 6. Let $\{Q_k : k = 1, 2, \ldots, K\}$ be a set of iid random
variables identically distributed according to Q. For a
fixed $(\mu, \nu)$, rejection sampling gives

$$\max_{1 \le k \le K} Q_k = O(\log K + \log \epsilon^{-1}) \tag{16}$$

with probability greater than $1 - \epsilon/2$.

Proof. For any integer $m \ge 2$, a straightforward computation
yields that

$$P(Q \le m) = \sum_{t=1}^{m} P_a (1 - P_a)^{t-1} = 1 - (1 - P_a)^m.$$

Therefore we have

$$P\left(\max_k Q_k \le m\right) = \left[1 - (1 - P_a)^m\right]^K = \left[1 - \exp(m \log(1 - P_a))\right]^K.$$

We want to choose $m = m(K, \epsilon)$ such that this probability
is greater than or equal to $1 - \epsilon/2$. First set
$m = -\rho \frac{\log K}{\log(1 - P_a)}$, where $\rho$ is to be determined. Then we have

$$P\left(\max_k Q_k \le m\right) = \left[1 - 1/K^{\rho}\right]^K.$$

We now need to choose $\rho > 1$ such that

$$\left[1 - 1/K^{\rho}\right]^K \ge 1 - \epsilon/2,$$

or equivalently, such that

$$1 - \left[1 - 1/K^{\rho}\right]^K \le \epsilon/2.$$

Without loss of generality, let K be even. Then by convexity,
we have $(1 - y)^K \ge 1 - Ky$. Apply this with $y = 1/K^{\rho}$ to
obtain

$$1 - \left[1 - 1/K^{\rho}\right]^K \le 1/K^{\rho - 1}.$$

Hence we need to choose $\rho \ge \log(2/\epsilon)/\log K + 1$ for the
bound to hold. Thus, if we set

$$m = \frac{\log K + \log(2/\epsilon)}{-\log(1 - P_a)},$$

then with probability greater than $1 - \epsilon/2$, all K rounds of
the protocol will use at most m rounds of rejection. ■
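The choice of m in this proof can be checked against the exact probability that all K geometric variables stay within the budget. The sketch below is illustrative: the function names are hypothetical, the budget is rounded up to an integer, and the acceptance probability is assumed to lie strictly between 0 and 1.

```python
import math

def rejection_budget(K, eps, p_accept):
    """Smallest integer m with
    m >= (log K + log(2/eps)) / (-log(1 - p_accept))."""
    return math.ceil((math.log(K) + math.log(2.0 / eps))
                     / -math.log(1.0 - p_accept))

def prob_all_within(K, m, p_accept):
    """Exact P(max_k Q_k <= m) for K iid geometric variables
    supported on {1, 2, ...} with success probability p_accept."""
    return (1.0 - (1.0 - p_accept) ** m) ** K
```

The budget grows only logarithmically in K and in $1/\epsilon$, which is the content of the lemma.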

3.3 Averaging with gossip 

As with averaging algorithms based on pairwise updates
[4], the convergence rate of our method is controlled by the
second largest eigenvalue $\lambda_2(W)$ of the matrix

$$W := I + \frac{1}{2n}\left[P + P^T - D\right],$$

where D is diagonal with entries $D_i = \sum_{j=1}^{n} [P_{ij} + P_{ji}]$.
The (i,j)-th entry of the matrix P is the probability that
node i exchanges values with node j. Without rejection
sampling, $P_{ij} = a_j$, and with rejection sampling, $P_{ij} = q_j$.
With this notation, we are now equipped to state and prove
the main result of the paper:

Theorem 1. The geographic gossip protocol with rejection
threshold $\tau = \Theta(n^{-1})$ has an averaging time

$$T_{\mathrm{ave}}(n, \epsilon) = O(n \log(1/\epsilon)). \tag{17}$$



Proof. To establish this bound, we exploit Theorem 3
of [4], which states that the $\epsilon$-averaging time is

$$T_{\mathrm{ave}}(\epsilon, P) = \Theta\left(\frac{\log \epsilon^{-1}}{\log \lambda_2(W)^{-1}}\right). \tag{18}$$

Thus, it suffices to prove that $\log \lambda_2(W)^{-1} = \Omega(1/n)$ to
establish the claim.

The probability of any sensor choosing sensor v is just $q_v$,
so that the matrix $P = \vec{1} q^T$. Note that the diagonal matrix
D has entries

$$D_i = \sum_{j=1}^{n} (P_{ij} + P_{ji}) = \sum_{j=1}^{n} q_j + \sum_{j=1}^{n} q_i = 1 + n q_i.$$

Thus, we can write W in terms of outer products as:

$$W = \left(I - \frac{1}{2n}\mathrm{diag}(\vec{1} + nq)\right) + \frac{1}{2n}\left(\vec{1} q^T + q \vec{1}^T\right). \tag{19}$$

Note that the matrix W is symmetric and positive semidefinite.

We claim that the second largest eigenvalue $\lambda_2(W) = O(1 - c/n)$,
for some constant c. By Taylor series expansion,
this will imply that $\log \lambda_2(W)^{-1} = \Omega(n^{-1})$ as desired.
To simplify matters, we transform the problem to finding
the maximum eigenvalue of an alternative matrix. Since W
is doubly stochastic, its largest eigenvalue is 1 and corresponds
to the eigenvector $v_1 = n^{-1/2}\vec{1}$. Consider the matrix
$W' = W - n^{-1}\vec{1}\vec{1}^T$; using equation (19), it can be decomposed
as

$$W' = D' + Q',$$

where $D' = \left(I - (2n)^{-1}\mathrm{diag}(\vec{1} + nq)\right)$ is diagonal and

$$Q' = \frac{1}{2n}\left(\vec{1}(q - n^{-1}\vec{1})^T + (q - n^{-1}\vec{1})\vec{1}^T\right)$$

is symmetric.

Note that by construction, the eigenvalues of W' are simply

$$\lambda(W') = \{0, \lambda_2(W), \ldots, \lambda_n(W)\}.$$

On one hand, suppose that $\lambda_1(W') > \lambda_2(W)$; in this case,
$(1 - c/n) > \lambda_2(W)$ and we are done. Otherwise, we have

$$\lambda_1(W') = \lambda_2(W).$$

Note that W' is the sum of a diagonal matrix and a symmetric
matrix with small entries. Weyl's theorem [9, p. 181]
guarantees that

$$\lambda_1(W') \le \lambda_1(D') + \lambda_1(Q') \le \left(1 - \frac{1}{2n}\right) + \lambda_1(Q').$$

It is therefore sufficient to bound $\lambda_1(Q')$. We do so using
the Rayleigh-Ritz theorem [9, p. 176], the Cauchy-Schwarz
inequality, and Lemma 4 as follows:

$$\begin{aligned}
\lambda_1(Q') &= \max_{y : \|y\|_2 = 1} y^T Q' y \\
&= \frac{1}{2n} \max_{y : \|y\|_2 = 1} y^T \left(\vec{1}(q - n^{-1}\vec{1})^T + (q - n^{-1}\vec{1})\vec{1}^T\right) y \\
&= \frac{1}{n} \max_{y : \|y\|_2 = 1} y^T \vec{1}\, (q - n^{-1}\vec{1})^T y \\
&\le \frac{1}{n}\, \|\vec{1}\|_2\, \|q - n^{-1}\vec{1}\|_2 \\
&\le \frac{1}{n}\, \sqrt{n}\, \frac{\epsilon}{\sqrt{n}} \\
&= \frac{\epsilon}{n}.
\end{aligned}$$

Now we have the total bound

$$\lambda_2(W) \le 1 - \frac{1}{2n} + \frac{\epsilon}{n}. \tag{20}$$

We can choose $\epsilon < 1/4$ using Lemma 4 to get the desired
bound. ■
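The eigenvalue bound can be probed numerically for small n. The sketch below builds W from a sampling distribution q and runs power iteration on the subspace orthogonal to the all-ones vector, where W and W' act identically. It is an illustration under stated assumptions rather than part of the proof: the function names, the starting vector and the iteration count are arbitrary choices, and the returned Rayleigh quotient is only an estimate of $\lambda_2(W)$ (it is always a lower bound on $\lambda_1(W')$).

```python
def gossip_matrix(q):
    """W = I - diag(1 + n q)/(2n) + (1 q^T + q 1^T)/(2n) for P = 1 q^T."""
    n = len(q)
    W = [[(q[j] + q[i]) / (2.0 * n) for j in range(n)] for i in range(n)]
    for i in range(n):
        W[i][i] += 1.0 - (1.0 + n * q[i]) / (2.0 * n)
    return W

def second_eigenvalue_estimate(q, iters=500):
    """Power iteration orthogonal to the all-ones vector;
    returns the final Rayleigh quotient."""
    n = len(q)
    W = gossip_matrix(q)
    x = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        mean = sum(x) / n
        x = [xi - mean for xi in x]          # project out the ones vector
        y = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    mean = sum(x) / n
    x = [xi - mean for xi in x]
    Wx = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(x, Wx)) / sum(a * a for a in x)
```

For the uniform distribution the matrix reduces to $(1 - 1/n) I + n^{-2}\vec{1}\vec{1}^T$, so the estimate should equal $1 - 1/n$, comfortably below the bound $1 - 1/(2n)$.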

The preceding theorem shows that by using rejection sampling
we can bound the convergence time of the gossip algorithm.
We can therefore bound the number of radio transmissions
required to estimate the average:

Corollary 1. The expected number of radio transmissions
required for our gossip protocol on the geometric random
graph $G(n, \sqrt{\log n / n})$ is upper bounded by

$$\mathcal{E}(n, \epsilon) = O\left(\frac{n^{3/2}}{\sqrt{\log n}} \log \epsilon^{-1}\right). \tag{21}$$

Moreover, with probability greater than $1 - \epsilon/2$, the maximum
number of radio transmissions is upper bounded by

$$\mathcal{D}(n, \epsilon) = O\left(\mathcal{E}(n, \epsilon)\left[\log n + \log \epsilon^{-1}\right]\right). \tag{22}$$

Remark: Note that for $\epsilon = n^{-\alpha}$ for any $\alpha > 0$, our bounds
are of the form $\mathcal{E}(n, 1/n^{\alpha}) = O(n^{3/2}\sqrt{\log n})$ and
$\mathcal{D}(n, \epsilon) = O(n^{3/2}\log^{3/2} n)$.

Proof. We just have to put the pieces together. If we
assume an asynchronous protocol, the expected total cost
is given by the product of $O(\sqrt{n/\log n})$ from routing,
$E[Q]$ from rejection sampling, and the averaging time $T_{\mathrm{ave}}$.
From Lemma 5, $E[Q] = O(1)$. Using equation (18) and
Theorem 1, we can bound $\log \lambda_2(W)^{-1}$ from below by
$1 - \lambda_2(W) = \Omega(n^{-1})$. Thus, the expected number of communications is

$$O\left(\sqrt{\frac{n}{\log n}}\; E[Q]\; n \log \epsilon^{-1}\right) = O\left(\frac{n^{3/2}}{\sqrt{\log n}} \log \epsilon^{-1}\right). \tag{23}$$

To upper bound the maximum number of transmissions with
high probability, we note that Lemma 6 guarantees that

$$\max_{1 \le k \le T_{\mathrm{ave}}} Q_k = O(\log T_{\mathrm{ave}} + \log \epsilon^{-1})$$

with high probability. Using Theorem 1 we can see that
$O(\log T_{\mathrm{ave}} + \log \epsilon^{-1}) = O(\log n + \log \epsilon^{-1})$. Consequently,
with probability greater than $1 - \epsilon/2$,

$$\mathcal{D}(n, \epsilon) = O\left(\mathcal{E}(n, \epsilon)\left[\log n + \log \epsilon^{-1}\right]\right). \tag{24}$$

■



4. SIMULATIONS 

Note that the averaging time defined in equation (1)
is a conservative measure, obtained by selecting the worst
case initial field $x(0)$ for each algorithm. Due to this conservative
choice, an algorithm is guaranteed to give (with
high probability) an estimated average that is $\epsilon$-close to the
true average for any choices of the underlying sensor observa- 
tions. As we have theoretically demonstrated, our algorithm 
is provably superior to standard gossiping schemes in terms 
of this metric. In this section, we evaluate our geographic 
gossip algorithm experimentally on specific fields that are 
of practical interest. We construct three different fields and 
compare geographic gossip to the standard gossip algorithm 
with uniform neighbor selection probability. Note that for 
random geometric graphs, standard gossiping with uniform 
neighbor selection has the same scaling behavior as with op- 
timal neighbor selection probabilities [4] , which ensures that 
the comparison is fair. 

Figures 4 through 6 illustrate how the cost of each algorithm
behaves for various fields and network sizes. The
error in the average estimation is measured by the normalized
$\ell_2$ norm $\|x(k) - x_{\mathrm{ave}}\vec{1}\|_2 / \|x(0)\|_2$. On the other axis we plot the
total number of radio transmissions required to achieve the
given accuracy. Figure 4 demonstrates how the estimation
error behaves for a field that varies linearly across one axis
of the unit square. In Figure 5 we use a field that is created
by placing three temperature sources in the unit square and
smoothing the field by a simple process that models temperature
diffusion. Finally, in Figure 6 we use a field that is zero
everywhere except in one node. For this field, the geographic
gossip protocol significantly outperforms the standard gossip
protocol as the network size and time increase, except
for large estimation tolerances ($\epsilon \sim 10^{-1}$) and few rounds.

Figure 4: Estimation accuracy versus total spent
energy for a linearly varying field.

Figure 5: Estimation accuracy versus total spent
energy for a smooth field modeling temperature.

As would be expected, simple gossip is capable of com- 
puting local averages quite fast. Therefore, when the field 
is sufficiently smooth, or when the averages in local node 
neighborhoods are close to the global average, simple gos- 
sip might generate approximate estimates which are closer 
to the true average with a smaller number of transmissions. 
For these cases however, finding the global average will not 
be useful in the first place. In all our simulations, the energy 
gains obtained by using geographic gossip were significant 
and asymptotically increasing for larger network sizes as our 
theoretical results suggest. 

5. CONCLUSIONS 

In this paper we have proposed a novel gossiping algorithm
for computing averages in networks in a completely
distributed and robust way. Geographic gossip computes
the averages faster than standard nearest-neighbor gossip
because it uses geographic knowledge to quickly diffuse
information everywhere in the network. It is not hard
to see that our algorithm is efficient for grids (it computes
the average to accuracy $1/n^{\alpha}$ in $O(n^{1.5}\log n)$ transmissions) and other
topologies that realistically model wireless networks. Even
if geographic routing cannot be performed, similar gossip
algorithms can be used for any network that can support some
form of routing to random nodes. Essentially, we can have
nearest-neighbor gossip happening on the overlay network
supported by random routing.

Figure 6: Estimation accuracy versus total spent
energy for a field which is zero everywhere except
in one node.

The proposed algorithm can be used instead of nearest 
neighbor gossip in all the schemes that use consensus based 
aggregation and will greatly reduce the communication cost. 
For example [18, 19, 21] use similar ideas for localization, 
Kalman filtering and sensor fusion. In these schemes, ge- 
ographic gossip can be used instead of standard nearest- 
neighbor gossip to improve energy consumption. 